ABSTRACT


The tools used in survival analysis are the Kaplan- 
Meier Estimator, a non-parametric statistic, and the Cox 
Proportional Hazard method. The Kaplan-Meier method 
estimates the survival curve taking into account censored 
data. Cox Proportional Hazard results include total 
values/censored values, covariate non-parametric estimate, 
standard error, chi-square statistic, P-value, and hazard 
ratio. We used the Mayo Clinic study of 418 Primary 
Biliary Cirrhosis patients during a ten-year period. In 
using these methods we found that the Kaplan-Meier 
survival curves were significantly different between the 
groups. Kaplan-Meier results include total values/censored 
values. 

The results indicate that the drugs did not make a major
difference to the outcome of the tests. Gender was the
substantial determining factor.
CHAPTER ONE


INTRODUCTION TO SURVIVAL ANALYSIS 


The term “survival analysis” pertains to a
statistical approach designed to take into account the
amount of time an experimental unit contributes to a
study. That is, it is the study of time between entry into
observation and a subsequent event. In survival analysis 
we observe the length of time from a starting point (such 
as the date of a hospital admission) until the occurrence 
of an endpoint event (such as death), often referred to as 
a “failure.” A key characteristic of survival analysis is 
the inclusion of partially missing (so-called “censored”)
data. For example, if a woman is alive at the study’s end we
do not know how long she is going to live; however, if her
start point occurred 180 days earlier, we do know that her 
survival time is at least 180 days. Loss to follow-up, and 
“closing the files” when a study ends are common censoring 
events. 

There are two aspects of survival analysis that make
it interesting from a data analysis perspective:

1. The response variable, time to failure, is
usually not normally distributed.

2. Survival analysis often involves censored data.

Originally the event of interest was death, hence the 
term, “survival analysis.” The analysis consisted of 
following the subject until death. Today the uses of
survival analysis vary quite a bit. Applications
now include time until onset of disease, employment, 
equipment failure, earthquake, and so on. The best way to 
define such events is simply to realize that these events 
are a transition from one discrete state to another at an 
instantaneous moment in time. Of course, the term 
“instantaneous”, which may be years, months, days, 
minutes, or seconds, is relative and has only the 
boundaries set by the researcher. 

The origin of survival analysis goes back to 
mortality tables from centuries ago. However, it was not 
until World War II that a new era of survival analysis 
emerged (See,[8]). This new era was stimulated by interest 
in reliability (or failure time) of military equipment. At 
the end of the war these newly developed statistical 
methods emerging from strict mortality data research were 
applied to failure time research, and quickly spread 
through private industry as customers became more
demanding of safer, more reliable products. As the uses of
survival analysis grew, parametric models gave way to
nonparametric and semi-parametric approaches because of
their appeal in dealing with the ever-growing field of 
clinical trials in medical research. Survival analysis was 
well suited for such work because medical intervention 
follow-up studies could start without all experimental 
units enrolled at the start of the observation time and 
could end before all experimental units had experienced an 
event. This is extremely important because even in the 
best-developed studies there will be subjects who choose 
to quit participating, who move too far away to follow, or 
who will die from some unrelated event. The researcher was
no longer forced to withdraw the experimental unit and all
associated data from the study; instead, techniques called
censoring enable researchers to analyze incomplete data
due to delayed entry or withdrawal from the study. This
was important in allowing each experimental unit to 
contribute all of the information possible to the model 
for the amount of time the researcher was able to observe 
the unit. 

Current software packages and high-performance
computers now make the computationally intensive
algorithms of survival analysis much easier to apply.

Some of the tools used in survival analysis are the
cumulative distribution function F(t), the probability
density function f(t), the survival function S(t), and the
hazard function $\lambda(t)$. Survival data are
generally described and modeled in terms of two related
functions, the survivor function and the hazard function. The
survivor function, S(t), represents the probability that an
individual survives from the time origin to some time
beyond t; it is non-negative and takes values between 0 and 1.
It satisfies S(0) = 1, and as t approaches $\infty$, S(t)
approaches 0.
The survivor function can be estimated non-parametrically
from observed data, both censored and uncensored, using
the Kaplan-Meier method. This method is also called the
product-limit method and is based on maximum likelihood
estimation. Suppose deaths occur at times
$t_1 < t_2 < \dots < t_j < \dots < t_k$.
The Kaplan-Meier estimator is the estimator used by most
software packages because of its simple step-function form. The
Kaplan-Meier estimator incorporates information from all
of the observations available, by considering any point in
time as a series of steps defined by the observed survival
and censored times.
\[
S(t) = P(T > t) = 1 - F(t) = 1 - \int_{0}^{t} f(u)\,du
\]
The above survival curve describes the relationship 
between the probability of survival and time. 
The cumulative distribution function is very useful
in describing the continuous probability distribution of a
random variable, such as time, in a survival analysis. The
cumulative distribution function of a random variable T,
denoted by F(t), is defined by $F(t) = P(T \le t)$. This is
interpreted as a function that gives the probability
that the variable T will be less than or equal to any value
t that we choose. Several properties of a distribution
function F(t) follow from the basic rules of
probability. Note that $0 \le F(t) \le 1$, that F(t) is a
non-decreasing function of t, and that as t approaches
$\infty$, F(t) approaches 1.
The complementary function S(t) = 1 - F(t) is also called the
survivorship or survival function. The hazard function
$\lambda(t)$ is given by the following:

\[
\lambda(t) = \lim_{\Delta \to 0} \frac{P(t < T < t + \Delta \mid T > t)}{\Delta}
           = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}
\]


The hazard function describes the concept of the risk of
an outcome (e.g., death, failure, hospitalization) in an
interval after time t, conditional on the subject having
survived to time t. It is the probability that an
individual dies somewhere between t and $t + \Delta$, divided by
the probability that the individual survived beyond time t,
expressed per unit of time as the interval length $\Delta$
shrinks to zero.
The hazard function seems to be more intuitive to use in
survival analysis than the probability density function
because it attempts to quantify the instantaneous risk
that an event will take place at time t given that the
subject survived to time t (See, [8], [9]).
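The relationship between the density, the survivor function, and the hazard can be checked numerically. The sketch below is only an illustration: it assumes an exponential failure-time distribution (a choice made here, not one taken from the study), for which the hazard is constant and equal to the rate. The function names and the rate value are hypothetical.

```python
import math

def exp_density(t, rate):
    """Density f(t) of an exponential failure time (illustrative choice)."""
    return rate * math.exp(-rate * t)

def exp_cdf(t, rate):
    """Cumulative distribution function F(t) = P(T <= t)."""
    return 1.0 - math.exp(-rate * t)

def survival(t, rate):
    """Survivor function S(t) = 1 - F(t)."""
    return 1.0 - exp_cdf(t, rate)

def hazard(t, rate):
    """Hazard computed as f(t) / S(t)."""
    return exp_density(t, rate) / survival(t, rate)

def approx_hazard(t, rate, delta=1e-6):
    """P(t < T < t + delta | T > t) / delta for a small interval delta."""
    prob_in_interval = exp_cdf(t + delta, rate) - exp_cdf(t, rate)
    return prob_in_interval / survival(t, rate) / delta

rate = 0.5  # hypothetical failure rate per unit time
for t in (0.5, 1.0, 4.0):
    print(t, survival(t, rate), hazard(t, rate), approx_hazard(t, rate))
```

For this distribution both versions of the hazard stay close to the rate at every t, while S(t) falls from 1 toward 0.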

The survivor function and hazard function can be
estimated from observed data. If the form of F(t) is not
specified then non-parametric procedures can be used;
otherwise parametric models can be fitted to the data.
The probability density function is also very useful in
describing the continuous probability distribution of a
random variable. Every continuous random variable has its
own density function; the probability P(a < T < b) is the area
under the density curve between times a and b.

Censored or incomplete data arise in survival analysis because
experiments are designed to run for a limited period of time
only, and have to account for lost observations. If
we observe a sample only for a short period of time, we
only know that some individuals were alive at the end of
the survey, and no information on their exact time of death
is available. Similarly, if observations are lost during
the experiment, all we know is that these individuals were
still alive at some stage, and no information on their
exact time of death is available.

Data are called right-censored if, for example, the survey
ends at a fixed date known in advance and the event of
interest has not happened by this date. All we know in this
case is that the event, if it happens, happens after the end
of the survey. Data are called left-censored if the event of
interest is known to have occurred before observation began
but the date at which it occurred is not available. All we
know in this case is that, for example, a certain disease
occurred before the examination.

Survival in two or more groups of patients
can be compared using a non-parametric test such as the
log-rank test, also called the Mantel-Cox test. This is
the most widely used method of comparing survival curves.

There are several reasons Cox’s proportional hazards
modeling was chosen to explain the effect of covariates on
time until event. They are the relative risk
non-parametric assumptions, the use of the partial likelihood
function, and the creation of survivor function estimates.

The non-parametric tests for comparing survival in
the Mantel-Cox method essentially calculate, at each death
time and for each treatment group, the expected number of
deaths under the null hypothesis of no difference between
groups. These are then summed to give the total expected
number of deaths in each treatment group, say $E_i$ for
treatment group i. The log-rank test compares the
observed number of deaths in each treatment group, say $O_i$
for treatment group i, to the expected number by
calculating the test statistic

\[
\chi^2 = \sum_{i=1}^{g} \frac{(O_i - E_i)^2}{E_i}
\]

and comparing it to a chi-square distribution with g-1
degrees of freedom, where g is the number of treatment
groups.
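As a rough sketch of how the observed and expected counts in this statistic might be accumulated, the following code walks through the distinct death times and shares out the deaths at each time in proportion to the number at risk in each group. The function name and the twelve observations are hypothetical and are not taken from the Mayo Clinic data.

```python
def logrank_chi_square(times, events, groups):
    """Approximate log-rank statistic: sum over groups of (O - E)^2 / E.

    times  : follow-up time of each subject
    events : 1 if the death was observed, 0 if the time is censored
    groups : treatment group label of each subject
    """
    labels = sorted(set(groups))
    observed = {g: 0.0 for g in labels}
    expected = {g: 0.0 for g in labels}
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        # Group labels of subjects still under observation just before t.
        at_risk = [g for s, g in zip(times, groups) if s >= t]
        deaths = [g for s, e, g in zip(times, events, groups)
                  if s == t and e == 1]
        for g in labels:
            observed[g] += deaths.count(g)
            # Under the null hypothesis, deaths are shared out in
            # proportion to the number at risk in each group.
            expected[g] += len(deaths) * at_risk.count(g) / len(at_risk)
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in labels)
    return observed, expected, chi2

# Hypothetical follow-up times for two treatment groups.
times = [6, 7, 10, 15, 19, 25, 1, 2, 3, 4, 5, 8]
events = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1]
groups = ["A"] * 6 + ["B"] * 6

observed, expected, chi2 = logrank_chi_square(times, events, groups)
print(observed, expected, chi2)
```

With two groups the result would be referred to a chi-square distribution with one degree of freedom. The total of the expected counts always equals the total of the observed counts, which is a useful check on the arithmetic.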

Nonparametric methods provide an alternative series
of statistical methods that require no or very limited
assumptions to be made about the data. Some
of the more commonly used are the nonparametric
alternatives to the t-tests, and it is these that are
covered in the present review.


EPI-Info™ version 3.3.2 is the software package used
in Chapters 3 and 4, especially for Cox Proportional
Hazard. EPI-Info is a public domain software package
designed for the global community of public health
practitioners and researchers. It provides for easy form
and database construction, data entry, and analysis with
epidemiologic statistics, maps, and graphs. Minitab 14 was
used in Chapters 1 and 2 for the Kaplan-Meier Estimator
(See, [14]).


CHAPTER TWO 


KAPLAN-MEIER ESTIMATOR 


The Kaplan-Meier estimate is a simple way to compute
the survival curve. It involves computing the number of
people who died at a certain time point, divided by the
number of people who were still in the study at that time;
one minus this fraction is the probability of surviving
that time point. These probabilities are multiplied by any
earlier computed probabilities, which is one reason this is
called a “product limit estimate.” The Kaplan-Meier survival
curve is often illustrated graphically. It looks like a poorly
designed staircase, with vertical steps downward at the
time of death of each individual subject (See Appendix D).

Often we will compare curves for two different groups
of subjects. For example, the survival pattern for
subjects on a standard therapy may be compared to that for
a newer therapy. We can look for gaps in these curves in a
horizontal or vertical direction. A vertical gap means
that at a specific time point, one group had a greater
fraction of subjects surviving. A horizontal gap means
that it took longer for one group to experience a certain
fraction of deaths.
To compute a survival curve, we need to note the time
of occurrence of events (e.g., failures, deaths) and let
$t_1, t_2, t_3, \dots$ represent the times when a death or
failure occurs.
It is possible for two or more events to occur at the same
time, in which case the number of distinct times is less
than the number of deaths or failures. We need to place
the t’s in order from smallest to largest, that is,

\[
t_1 < t_2 < t_3 < \cdots
\]

We also need to define the starting point of the study,
$t_0 = 0$. The basic computations for the Kaplan-Meier survival
curve rely on the computation of conditional survival
probabilities, in particular the probability
$P[T > t_i \mid T > t_{i-1}]$,
which can be interpreted as the probability of a subject
surviving to a specific time, given that the subject
survived to the previous time. This probability is easy to
calculate if we know the number of deaths or failures at a
specific time and we know the number of patients
at risk at that time.

A more difficult (but more important) probability is
the unconditional probability of survival, $P[T > t_i]$, which
represents the simple probability of survival to a
specific time.
Armed with this information we can now compute a
Kaplan-Meier survival curve. First we need to calculate the
number of patients at risk,
$n_i = n_{i-1} - d_{i-1} - c_{i-1}$. In other words,
the number at risk at any specific time point is just the
number at risk at the previous time point, minus the
number of deaths/failures and the number of censored
observations. For convenience, we define $n_0$ to be the total
number of patients in the study, $c_0$ to be the number of
censored observations prior to the first death or failure,
and $d_0 = 0$. Next we compute the estimate of the conditional
probability of survival (See, [1], [9]):


\[
P[T > t_i \mid T > t_{i-1}] = 1 - \frac{d_i}{n_i}
\]


Finally, the unconditional probability of survival is 
simply the cumulative product of the conditional 
probabilities. 


\[
P[T > t_i] = \prod_{j=1}^{i} \left( 1 - \frac{d_j}{n_j} \right)
\]
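The bookkeeping described above (number at risk, deaths, censored observations, and the running product of conditional survival probabilities) can be sketched in a few lines. This is a minimal illustration rather than the Minitab procedure used in this thesis; the function name and the six observations are hypothetical.

```python
def kaplan_meier(times, events):
    """Return rows (t_i, n_i, d_i, S(t_i)) at each distinct death time.

    times  : follow-up time of each subject
    events : 1 if a death/failure was observed, 0 if the time is censored
    """
    n_at_risk = len(times)      # everyone is at risk at the start
    survival = 1.0
    table = []
    for t in sorted(set(times)):
        d = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        c = sum(1 for s, e in zip(times, events) if s == t and e == 0)
        if d > 0:
            # Conditional probability of surviving this time: 1 - d_i / n_i,
            # multiplied into the product of all earlier probabilities.
            survival *= 1.0 - d / n_at_risk
            table.append((t, n_at_risk, d, survival))
        # Deaths and censored observations both leave the risk set.
        n_at_risk -= d + c
    return table

# Hypothetical data: six subjects, two of them censored.
times = [2, 3, 3, 5, 6, 8]
events = [1, 1, 0, 1, 0, 1]

table = kaplan_meier(times, events)
for row in table:
    print(row)
```

A subject censored at the same time as a death is treated here as still at risk at that time, which is the usual convention.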


Censoring 
Censoring is a key concept for survival analysis. 


Censoring is a form of missing data. In an experiment in
which subjects are followed over time until an event of
interest (such as death or other type of failure) occurs, 
it is not always possible to follow every subject until 
the event is observed. An event is usually death (but 
other events used in the literature include hospital 
discharge, development of a disease, and relapse of a 
malignancy). The event is also referred to as a failure. 
Subjects may drop out of the study and be lost to follow- 
up, or be deliberately withdrawn, or the end of the data 
collection period may arrive before the event is observed 
to happen. For such a subject, all that is known is that 
the time to the event was at least as long as the time to 
when the subject was last observed. The observed time to 
the event under such circumstances is censored. Survival 
analysis methods generally allow for censored data. 
Censoring may occur from the right (observation stops 
before the event is observed) as in censorship for 
survival analysis, or from the left (observation does not 
begin until after the event has occurred). 

Suppose that the following Primary Biliary Cirrhosis
data are observed from 15 patients (n=15) with Platelets. Seven
patients relapse at 9.7, 10.3, 10.6, 11, 12, 12.2, and 13.6
months.

The Kaplan-Meier estimates can be calculated by 
constructing a table with five columns following the 
outline below. 

1. Column 1 contains all the survival times, both
censored and uncensored, in order from smallest to
largest.

2. The second column, labeled i, consists of the
corresponding rank of each observation in
column 1.

3. The third column, labeled r, pertains to
uncensored observations only. Let r=i.

4. Compute (n-r)/(n-r+1), or $p_i$, for every uncensored
observation $t_{(i)}$, in column 4 to give the
proportion of patients surviving up to and then
through $t_{(i)}$.

5. In column 5, $\hat{S}(t)$ is the product of all values of
(n-r)/(n-r+1) up to and including t. If some
uncensored observations are ties, the smallest
$\hat{S}(t)$ should be used.


To summarize this procedure, let n be the total number of
patients whose survival times, censored or not, are
available. Re-label the n survival times in order of
increasing magnitude such that
$t_{(1)} \le t_{(2)} \le \dots \le t_{(n)}$. Then

\[
\hat{S}(t) = \prod_{t_{(r)} \le t} \frac{n - r}{n - r + 1},
\]

where r runs through those positive integers
for which $t_{(r)} \le t$ and $t_{(r)}$ is uncensored. The values of r are
consecutive integers 1, 2, ..., n if there are no censored
observations; if there are censored observations, their
ranks are not counted. The estimated median survival time is
the 50th percentile, which is the value of t at which
$\hat{S}(t) = 0.50$. See
Appendix B for an example of the calculation of a
Kaplan-Meier estimate. For calculations by Minitab, see
Appendix C, and for a graph of Kaplan-Meier survival curves
by gender, see Appendix D.
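The five-column table procedure can also be sketched in code. The version below ranks the ordered observations, multiplies the factors (n - r)/(n - r + 1) for uncensored ranks only, and reads off the median as the first time at which the estimate is at or below 0.50. The function names and the six observations are hypothetical; the thesis calculations themselves are in Appendix B.

```python
def product_limit_by_rank(times, events):
    """Product-limit estimate built from ranks r of uncensored observations."""
    n = len(times)
    # Column 1: order from smallest to largest; at tied times the
    # uncensored observations are placed before the censored ones.
    ordered = sorted(zip(times, events), key=lambda p: (p[0], -p[1]))
    estimate = {}
    s = 1.0
    for rank, (t, event) in enumerate(ordered, start=1):
        if event == 1:                       # r = i for uncensored only
            s *= (n - rank) / (n - rank + 1)
            estimate[t] = s                  # ties keep the smallest value
    return estimate

def median_survival(estimate):
    """Smallest uncensored time at which the estimate is at or below 0.50."""
    for t in sorted(estimate):
        if estimate[t] <= 0.5:
            return t
    return None                              # estimate never reaches 0.50

# Hypothetical data: 1 = event observed, 0 = censored.
times = [2, 3, 3, 5, 6, 8]
events = [1, 1, 0, 1, 0, 1]

estimate = product_limit_by_rank(times, events)
print(estimate, median_survival(estimate))
```

Reading the median off the step function in this way is one common convention; software packages may interpolate or report it slightly differently.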


Log - Rank Test 
Often it is of interest to determine whether two or 
more samples could have arisen from identical survivor 
functions. One approach would involve the use of the 
asymptotic results for F(t) mentioned above to devise a
test for equality of the survivor functions at some
pre-specified time t. Such a procedure, however, would not
usually make efficient use of the available data, and 
attention in recent years has turned instead to test 
statistics that attempt to summarize differences between 
survivor function estimators over the whole of the study 
period. The log-rank test is particularly good when the
ratio of hazard functions in the populations being
compared is approximately constant. It can also be
advocated on the basis of ease of presentation to
non-statistical personnel, since the test statistic is the
difference between the observed number of failures in each
group and a quantity that, for most purposes, can be
thought of as the corresponding expected number of
failures under the null hypothesis (See, [2], [4]).
Suppose one wishes to test the equality of the
survivor functions $F_1(t), \dots, F_r(t)$ on the basis of
samples from each of r populations. Let
$t_1 < t_2 < \dots < t_k$ denote the failure
times for the sample formed by pooling the r individual
samples. Suppose $d_j$ failures occur at $t_j$ and that $n_j$ study
subjects are at risk just prior to $t_j$ $(j = 1, \dots, k)$,
and let $d_{ij}$ and $n_{ij}$ be the corresponding numbers in
sample i $(i = 1, \dots, r)$. The
data at $t_j$ are in the form of a $2 \times r$ contingency table with
$d_{ij}$ failures and $n_{ij} - d_{ij}$ survivors in the ith
column $(i = 1, \dots, r)$.


Conditional on the failure and censoring experience up to
time $t_j$, the distribution of $d_{1j}, \dots, d_{rj}$ is
simply the product of binomial distributions

\[
\prod_{i=1}^{r} \binom{n_{ij}}{d_{ij}} \lambda_j^{d_{ij}} (1 - \lambda_j)^{n_{ij} - d_{ij}}
\]


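where $\lambda_j$ is the conditional failure probability at $t_j$,
which is common for each of the r samples under the null
hypothesis. The conditional distribution for
$d_{1j}, \dots, d_{rj}$ given $d_j$ is then the hypergeometric
distribution

\[
\prod_{i=1}^{r} \binom{n_{ij}}{d_{ij}} \bigg/ \binom{n_j}{d_j} \tag{2.1}
\]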


The mean and variance of $d_{ij}$ from (2.1) are, respectively,

\[
w_{ij} = n_{ij} d_j n_j^{-1}
\quad \text{and} \quad
(V_j)_{ii} = n_{ij} (n_j - n_{ij}) d_j (n_j - d_j) n_j^{-2} (n_j - 1)^{-1}.
\]

The covariance of $d_{ij}$ and $d_{lj}$ is

\[
(V_j)_{il} = -n_{ij} n_{lj} d_j (n_j - d_j) n_j^{-2} (n_j - 1)^{-1}.
\]

Thus the statistic $v_j' = (d_{1j} - w_{1j}, \dots, d_{rj} - w_{rj})$
has (conditional) mean zero and variance matrix $V_j$, where
the prime denotes vector transpose. See Appendix E for an
example of the log-rank test.
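For the special case of two samples (r = 2), the hypergeometric mean and variance at each failure time can be summed over the failure times to give a single score and its variance. The sketch below follows that idea; the function name and the data are hypothetical, and the last line forms the usual squared-score-over-variance statistic, which is an assumption about how the pieces would be combined rather than a step stated in the text above.

```python
def logrank_two_sample(times, events, groups, first="A"):
    """Sum of (d_1j - w_1j) and of (V_j)_11 over the pooled failure times."""
    score, variance = 0.0, 0.0
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in failure_times:
        # Numbers at risk just prior to t: pooled and in the first sample.
        n_j = sum(1 for s in times if s >= t)
        n_1j = sum(1 for s, g in zip(times, groups) if s >= t and g == first)
        # Failures at t: pooled and in the first sample.
        d_j = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        d_1j = sum(1 for s, e, g in zip(times, events, groups)
                   if s == t and e == 1 and g == first)
        w_1j = n_1j * d_j / n_j              # hypergeometric mean
        score += d_1j - w_1j
        if n_j > 1:                          # no variability when n_j = 1
            variance += (n_1j * (n_j - n_1j) * d_j * (n_j - d_j)
                         / (n_j ** 2 * (n_j - 1)))
    return score, variance, score ** 2 / variance

# Hypothetical follow-up times for two samples.
times = [6, 7, 10, 15, 19, 25, 1, 2, 3, 4, 5, 8]
events = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1]
groups = ["A"] * 6 + ["B"] * 6

score_a, var_a, stat_a = logrank_two_sample(times, events, groups, first="A")
score_b, var_b, stat_b = logrank_two_sample(times, events, groups, first="B")
print(score_a, var_a, stat_a)
```

Because the two scores are equal and opposite and share the same variance, it does not matter which sample is labeled first.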
CHAPTER THREE


COX PROPORTIONAL HAZARD REGRESSION MODEL 


The Cox proportional hazards model is probably the
most widely used method for modeling survival data. For 
data with one explanatory variable, i.e. one covariate, 
non-parametric methods like plotting Kaplan-Meier survival 
probabilities may be adequate if the groups being compared 
are reasonably similar. Frequently however, the groups 
being compared differ in many respects. They may have 
different age distributions, different proportions of men 
and women, different smoking habits etc. These differences 
come in addition to the covariates we are really 
interested in, and the analysis must be adjusted to 
compensate for these other differences, which may 
otherwise confound the analysis. The Cox proportional 
hazards model is a semi-parametric model for fitting
survival data. The basic model is as follows:

\[
h(t \mid Z) = h_0(t) \exp(\beta' Z)
\]

where $h_0(t)$ is the baseline hazard, which may vary
arbitrarily over time, and Z is the covariate vector. The
covariates may in general be time-dependent, but here they
are fixed at the start of the study. The vector
$\beta = (\beta_1, \dots, \beta_p)$ is a vector of
covariate coefficients. The baseline hazard is treated
non-parametrically, but the individual covariate effects
($\beta_i$) are assumed to be constant throughout the study. The


model is often called the proportional hazards model 
because of this constant covariate effect throughout the 
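study. If two individuals are compared that have covariate
values Z and Z*, the ratio of their hazard rates at any
time point simplifies to

\[
\frac{h(t \mid Z)}{h(t \mid Z^*)}
= \frac{h_0(t) \exp\left[ \sum_{k=1}^{p} \beta_k Z_k \right]}
       {h_0(t) \exp\left[ \sum_{k=1}^{p} \beta_k Z_k^* \right]}
= \exp\left[ \sum_{k=1}^{p} \beta_k (Z_k - Z_k^*) \right]
\]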


This ratio is constant or “proportional” throughout the 
study. This assumption greatly facilitates the 
interpretation of covariate effects, as the effect of a 
given covariate compared to the absence of that covariate 
is expressed as a single constant. This does not, however,
imply that the absolute difference between the hazards of
the two individuals discussed above is constant; the
exponentiated covariates act multiplicatively on a baseline
hazard which may vary freely (See, [3]).
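A small numerical sketch may help show why the baseline hazard cancels in the ratio. The coefficients, covariate values, and baseline function below are all hypothetical, chosen only for illustration; they are not estimates from the Primary Biliary Cirrhosis data.

```python
import math

def cox_hazard(t, beta, z, baseline):
    """h(t | Z) = h_0(t) * exp(beta'Z) for a user-supplied baseline hazard."""
    return baseline(t) * math.exp(sum(b * x for b, x in zip(beta, z)))

def hazard_ratio(beta, z, z_star):
    """exp(sum_k beta_k (Z_k - Z*_k)); it does not depend on t."""
    return math.exp(sum(b * (x - y) for b, x, y in zip(beta, z, z_star)))

def baseline(t):
    """An arbitrary baseline hazard that varies over time (assumption)."""
    return 0.01 * (1.0 + t)

beta = [0.7, -0.02]      # hypothetical coefficients
z = [1, 50]              # hypothetical covariates of the first individual
z_star = [0, 50]         # hypothetical covariates of the second individual

ratios = []
for t in (1.0, 5.0, 20.0):
    ratio = cox_hazard(t, beta, z, baseline) / cox_hazard(t, beta, z_star, baseline)
    ratios.append(ratio)
    print(t, ratio, hazard_ratio(beta, z, z_star))
```

The individual hazards grow with t because the baseline does, yet the ratio printed at each time is the same constant, while the absolute difference between the two hazards changes with t.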




Cox Model with Several Covariates 
Fitting of the multivariate Cox proportional hazards
model would be conducted by backward elimination, starting
with a model with all variables listed above. One by one,
the least significant variable would be removed until only
significant variables remained in the model. Data for
overall survival were modeled in the same way (See, [5]).


The Assumption of Proportional Hazards 

Since the Cox proportional hazards model relies on
the hazards being proportional, i.e. on the effect of a
given covariate not changing over time, it is very
important to verify that the covariates satisfy the
assumption of proportionality. If the assumption is
violated, the simple Cox model is invalid, and more
sophisticated analyses are required. If the interest
centers upon a binary covariate, $Z_1$, whose relative risk
changes over time, one approach is to introduce a
time-dependent covariate as follows. Let

\[
Z_2(t) =
\begin{cases}
g(t) & \text{if the covariate } Z_1 \text{ takes on the value } 1 \\
0 & \text{if the covariate } Z_1 \text{ takes on the value } 0,
\end{cases}
\]
where g(t) is a known function of time. In such cases, it
may be preferable to use a procedure that would allow the
function g(t) to be estimated from the data. One approach
to this problem is to fit a model with an indicator
function for g(t). In the simplest approach, define a
time-dependent covariate

\[
Z_2(t) =
\begin{cases}
Z_1 & \text{if } t > \tau \\
0 & \text{if } t \le \tau
\end{cases}
\]


To determine the optimal value of $\tau$, the model
including the new covariate $Z_2(t)$ is fitted for a set of
values for $\tau$, and the value of $\tau$ that gives the
largest maximized log partial likelihood is the optimal
value to use. Proportional hazards can then be tested for
each region, and if the test fails for t on either side of
$\tau$, this process can be repeated in that region.

The assessment of the proportional hazards assumption 
can be done numerically or graphically. A great number of 
procedures have been proposed over the years. Some of the 
procedures require partitioning of failure time, some 
require categorization of covariates, some include a 
spline function, and some can be applied to the 
untransformed dataset. None of the methods, either
numerical or graphical, is today known to be better than
the others in finding out whether the hazards are
proportional or not. Some authors recommend using
numerical tests, and others recommend graphical procedures
since they believe that the proportional hazards
assumption only approximates the correct model for a
covariate and that any formal test, based on a large
enough sample, will reject the null hypothesis of
proportionality.


Maximum Likelihood 
The likelihood and log-likelihood functions are the
basis for deriving estimators for parameters, given data.
While the shapes of these two functions are different,
they have their maximum point at the same value. In fact,
the value of $\theta$ that corresponds to this maximum point is
defined as the Maximum Likelihood Estimate, and that value
is denoted as $\hat{\theta}$. This is the value that is “most likely”
relative to the other values. This is a simple concept,
and it has a host of good statistical properties. Thus, in
general, we seek $\hat{\theta}$ such that this value maximizes the
log-likelihood function (See, [4], [7]).
Generally, calculus is used to find the maximum
point of the log-likelihood function and obtain Maximum
Likelihood Estimates in closed form. This is tedious and
often not useful in real problems (where a closed form
estimator may often not even exist). The log-likelihood
functions we will see have a single mode or maximum point
and no local optima. These conditions make the use of
numerical methods appealing and efficient (See, [6]).

Consider first the binomial model with a single
unknown parameter, $\theta$. Using calculus one could take the
first partial derivative of the log-likelihood function
with respect to $\theta$, set it to zero and solve for $\theta$.
This solution will give $\hat{\theta}$, the Maximum Likelihood
Estimate. This value, $\hat{\theta}$, is the one that maximizes the
likelihood function. It is the value of the parameter that
is most likely, given the data.
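For the binomial model the calculus can be carried through by hand: setting the derivative of the log-likelihood to zero gives the observed proportion of successes as the estimate. The sketch below compares that closed form with a crude numerical search over a grid of parameter values. The counts (7 successes in 20 trials) and the function names are hypothetical, and a grid search merely stands in for the more refined numerical methods used by statistical software.

```python
import math

def binomial_log_likelihood(theta, successes, trials):
    """Log-likelihood up to an additive constant (binomial coefficient omitted)."""
    return (successes * math.log(theta)
            + (trials - successes) * math.log(1.0 - theta))

def maximize_on_grid(successes, trials, steps=9999):
    """Return the grid value in (0, 1) with the largest log-likelihood."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return max(grid,
               key=lambda th: binomial_log_likelihood(th, successes, trials))

successes, trials = 7, 20            # hypothetical data
closed_form = successes / trials     # solution of d(log L)/d(theta) = 0
numeric = maximize_on_grid(successes, trials)
print(closed_form, numeric)
```

Omitting the binomial coefficient does not move the maximum, because it does not depend on the parameter.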

The likelihood function provides information on the
relative likelihood of various parameter values, given the
data and the model (here, a binomial). Think of 10 of your
friends, 9 of whom have one raffle ticket each, while the
10th friend, who has 4 tickets, has a higher likelihood of
winning relative to the other 9 friends. If you were to
try to select the most likely winner of the raffle, which
person would you pick? Most would select the person with 4
tickets. Now, what if 8 people had a single ticket, one
had 4 tickets, but the last had 80 tickets? Surely the
person with 80 tickets is most likely to win (but not with
certainty). In this simple example you have a feeling
about the “strength of evidence” about the likely winner.
In the first case, one person has an edge, but not much
more. In the second case, the person with the 80 tickets
is relatively very likely to win.

The shape of the log-likelihood function is important
in a way that is conceptually similar to the raffle ticket
example. If the log-likelihood function is relatively flat,
one can make the interpretation that several (perhaps many)
values of p are nearly equally likely. They are somewhat
alike; this is quantified as the sampling variance or
standard error.
If the log-likelihood function is fairly flat, this
implies considerable uncertainty and this is reflected in
large sampling variances and standard errors, and wide
confidence intervals. On the other hand, if the
log-likelihood function is fairly peaked near its maximum
point, this indicates some values of p are relatively very
likely compared to others (like the person with 80 raffle
tickets). There is some considerable degree of certainty
implied and this is reflected in small sampling variances
and standard errors, and narrow confidence intervals. So,
the log-likelihood function at its maximum point is
important as well as the shape of the function near this
maximum point.
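One way to put a number on “flat” versus “peaked” is the second derivative of the log-likelihood at its maximum: the more negative it is, the sharper the peak and the smaller the standard error. The sketch below assumes a binomial log-likelihood, approximates the second derivative by a central difference, and uses the standard large-sample approximation that the standard error is the square root of minus one over that derivative. The counts and function names are hypothetical.

```python
import math

def log_likelihood(p, successes, trials):
    """Binomial log-likelihood up to an additive constant."""
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)

def numerical_second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def standard_error(successes, trials):
    """Approximate standard error from the curvature at the maximum."""
    p_hat = successes / trials
    curvature = numerical_second_derivative(
        lambda p: log_likelihood(p, successes, trials), p_hat)
    return math.sqrt(-1.0 / curvature)

# Same estimate (0.35) from a small and a large hypothetical sample.
se_small = standard_error(7, 20)      # flatter log-likelihood
se_large = standard_error(70, 200)    # more peaked log-likelihood
print(se_small, se_large)
```

For the binomial model the result can be compared with the familiar formula sqrt(p(1 - p)/n), which the numerical curvature reproduces closely.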

The shape of the likelihood function near the maximum
point can be measured by the analytical second partial
derivatives, and these can be closely approximated
numerically by a computer. Such numerical derivatives are
important in complicated problems where the log-likelihood
is a function of 20-60 parameters.

The advantage of this method is that
maximum likelihood provides a consistent approach to
parameter estimation problems. This means that maximum
likelihood estimates can be developed for a large variety
of estimation situations. For example, they can be applied
in reliability analysis to censored data under various
censoring models (See, [10], [11]).

Maximum likelihood methods have desirable
mathematical and optimality properties. Specifically,

• They become minimum variance unbiased estimators
as the sample size increases. By unbiased, we
mean that if we take (an infinite number of)
random samples with replacement from a
population, the average value of the parameter
estimates will be theoretically
exactly equal to the population value. By
minimum variance, we mean that the estimator has
the smallest variance, and thus the narrowest
confidence interval, of all estimators of that
type.

• They have approximate normal distributions and
approximate sample variances that can be used to
generate confidence bounds and hypothesis tests
for the parameters.


Several popular statistical software packages provide
excellent algorithms for maximum likelihood estimates for
many of the commonly used distributions. This helps
mitigate the computational complexity of maximum
likelihood estimation.

The disadvantage of this method is that
the likelihood equations need to be specifically worked
out for a given distribution and estimation problem. The
mathematics is often non-trivial, particularly if
confidence intervals for the parameters are desired.

The numerical estimation is usually non-trivial. 
Except for a few cases where the maximum likelihood 
formulas are in fact simple, it is generally best to rely 
on high quality statistical software to obtain maximum 
likelihood estimates. Fortunately, high quality maximum 
likelihood software is becoming increasingly common. 

Maximum likelihood estimates can be heavily biased
for small samples. The optimality properties may not apply
for small samples. Maximum likelihood can be sensitive to
the choice of starting values.


Partial Likelihood 
To obtain estimates of the covariate parameters, Cox
developed a nonparametric method he called partial
likelihood. Estimation of the parameter values is then
obtained by use of maximum partial likelihood estimation.
The partial likelihood method is based on the assumption,
related to $h_0(t)$ being undetermined, that the intervals
between successive duration times (or failure times)
contribute no information regarding the relationship
between the covariates and the hazard rate.

This is in contrast to the parametric methods, where the
actual survival times are used in the construction of the
likelihood function. Because the Cox model only uses
“part” of the available data ($h_0(t)$ is not estimated), the
likelihood function for the Cox model is a “partial”
likelihood function, hence the name. To get a sense for
how this works, look at the logic underlying the partial
likelihood method. Consider the data in Appendix F. Here
are the survival times for fifteen cases. Of these fifteen
cases four of them are right-censored and coded 1. In all
the tables in the Appendix, 0 represents male and 1
represents female.


In the Appendix F table, the failure times, indexed by
case number, are $t_1 = 51$, $t_2 = 264$, $t_3 = 611$,
$t_4 = 762$, $t_6 = 1012$, $t_7 = 1217$, $t_8 = 1427$,
$t_9 = 2466$, $t_{11} = 2689$, $t_{14} = 4079$, and
$t_{15} = 4191$ follow up days; $t_5$, $t_{10}$, $t_{12}$,
and $t_{13}$ are censored (See Appendix G).


• Events can be ordered.
• At $t_1$ all cases are at risk of failing.
• After the first failure, the risk set decreases by 1.
• The risk set successively dwindles as events occur.
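The way the risk set dwindles can be traced with a short sketch on the Appendix F data. The tuple layout and the function name are assumptions made for this illustration; the case numbers, follow up days, and censoring flags are those listed in Appendix F.

```python
# Appendix F: (case number, follow up days, censored flag), 1 = right-censored.
CASES = [
    (1, 51, 0), (2, 264, 0), (3, 611, 0), (4, 762, 0), (5, 877, 1),
    (6, 1012, 0), (7, 1217, 0), (8, 1427, 0), (9, 2466, 0), (10, 2504, 1),
    (11, 2689, 0), (12, 3445, 1), (13, 3672, 1), (14, 4079, 0), (15, 4191, 0),
]


def risk_sets(cases):
    """Map each failing case to the case numbers still at risk at its failure time."""
    result = {}
    for case_id, days, censored in cases:
        if censored:
            continue  # a censored case never defines a failure time
        result[case_id] = [c for c, d, _ in cases if d >= days]
    return result


sets = risk_sets(CASES)
for case_id, members in sets.items():
    print(case_id, len(members), members)
```

Censored cases appear in the risk sets of earlier failures but never open a risk set of their own, which is the sense in which they inform only the denominator of the partial likelihood.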


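To motivate the partial likelihood estimator, let
$\psi(i) = \exp(\beta' x_i)$. The partial likelihood
function for these data would be equivalent to:

\[
L_p = \frac{\psi(1)}{\sum_{j \ge 1} \psi(j)} \cdot
      \frac{\psi(2)}{\sum_{j \ge 2} \psi(j)} \cdot
      \frac{\psi(3)}{\sum_{j \ge 3} \psi(j)} \cdot
      \frac{\psi(4)}{\sum_{j \ge 4} \psi(j)} \cdot
      \frac{\psi(6)}{\sum_{j \ge 6} \psi(j)} \cdot
      \frac{\psi(7)}{\sum_{j \ge 7} \psi(j)}
\]
\[
\qquad \cdot \frac{\psi(8)}{\sum_{j \ge 8} \psi(j)} \cdot
      \frac{\psi(9)}{\sum_{j \ge 9} \psi(j)} \cdot
      \frac{\psi(11)}{\sum_{j \ge 11} \psi(j)} \cdot
      \frac{\psi(14)}{\sum_{j \ge 14} \psi(j)} \cdot
      \frac{\psi(15)}{\sum_{j \ge 15} \psi(j)}
\]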
For a similar illustrative calculation see Appendix G. In
words, this tells us that each of the fifteen cases is at
risk of experiencing an event up to the first failure
time, $t_1$. After the first failure in the data set, the
risk set decreases in size by 1; thus, the risk set up to
the second failure time, $t_2$, includes all cases except
case 1. By the fourth failure time in the data, $t_4$, the
risk set includes only cases 4 through 15. By the last
failure time, only case 15 remains in the risk set. This
exercise shows that the partial likelihood function is
solely based on the ordered duration times, and not on the
length of the interval between duration times. Also,
censored observations contribute information to the “risk
set,” that is, the cases that are surviving to time $t_i$,
but contribute no information regarding failure times.

To be more formal, suppose we have a data set with $n$
observations and $k$ distinct failure (event) times. Cox
estimation first proceeds by sorting the failure times,
such that

\[ t_1 < t_2 < \cdots < t_k, \]

where $t_i$ denotes the failure time for the $i$th
individual. For censored cases, we define $\delta_i$ to be
0 if the case is right-censored, and 1 if the case is
uncensored (note that this is the reverse of the Censor
column coding in Appendix F). Finally, the ordered event
times are modeled as a function of covariates, $x$.

The partial likelihood function is derived by taking
the product of the conditional probability of a failure at
time $t_i$, given the number of cases that are at risk of
failing at time $t_i$. That is to say, given that some
event has occurred, what is the probability the event
occurred to the $i$th individual from a risk set of size
$n$? More formally, if we define $R(t_i)$ to denote the
set of cases that are at risk of experiencing an event at
time $t_i$, that is, the “risk set,” then the probability
that the $j$th case will fail at time $T_i$ is given by

\[
\Pr(t_j = T_i \mid R(t_i)) =
  \frac{e^{\beta' x_i}}{\sum_{j \in R(t_i)} e^{\beta' x_j}},
  \tag{3.1}
\]

where the summation operator in the denominator is summing
over all individuals in the risk set. Taking the product
of the conditional probabilities in (3.1) yields the
partial likelihood function (a similar example can be
found in [5], [6]),

\[
L_p = \prod_{i=1}^{k} \left[
  \frac{e^{\beta' x_i}}{\sum_{j \in R(t_i)} e^{\beta' x_j}}
  \right]^{\delta_i},
\]

with corresponding log-likelihood function,

\[
\log L_p = \sum_{i=1}^{k} \delta_i \left[ \beta' x_i
  - \log \sum_{j \in R(t_i)} e^{\beta' x_j} \right].
  \tag{3.2}
\]
By maximizing the log-likelihood in (3.2), estimates of
the $\beta$ may be obtained. What is the importance of
this result?
• Specifying the baseline hazard, $h_0(t)$, is
unnecessary.
• The interval between events does not inform the
partial likelihood function.
• Censored cases contribute information only
pertinent to the risk set (i.e. the denominator,
not the numerator).
The critical thing here is to note that no assumptions
about the shape of the baseline hazard need to be made.
Another way to see this is to think about the heuristic
partial likelihood function above. All we need to know to
compute a probability is $\psi$ (or $\exp(\beta' x_i)$).
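As a minimal sketch of maximizing the log partial likelihood in (3.2), the fragment below uses the fifteen Appendix F cases with sex as the single covariate. The Newton-Raphson loop, the function names, and the event coding (1 = observed failure, the reverse of the Censor column) are assumptions made for this illustration and are not the routines used by EPI-Info or Minitab. The failure times are assumed to be distinct, as they are in Appendix F.

```python
import math

# (follow up days, event, sex) for the fifteen Appendix F cases;
# event: 1 = failure observed, 0 = right-censored; sex: 0 = male, 1 = female.
DATA = [
    (51, 1, 1), (264, 1, 1), (611, 1, 0), (762, 1, 0), (877, 0, 0),
    (1012, 1, 0), (1217, 1, 0), (1427, 1, 1), (2466, 1, 1), (2504, 0, 1),
    (2689, 1, 0), (3445, 0, 0), (3672, 0, 1), (4079, 1, 0), (4191, 1, 0),
]


def log_partial_likelihood(beta):
    """Sum over failures of beta*x_i - log(sum over the risk set of exp(beta*x_j))."""
    total = 0.0
    for t_i, event_i, x_i in DATA:
        if not event_i:
            continue  # censored cases enter only through the risk sets
        denom = sum(math.exp(beta * x_j) for t_j, _, x_j in DATA if t_j >= t_i)
        total += beta * x_i - math.log(denom)
    return total


def score_and_information(beta):
    """First derivative and minus the second derivative of the log partial likelihood."""
    score, information = 0.0, 0.0
    for t_i, event_i, x_i in DATA:
        if not event_i:
            continue
        weighted = [(math.exp(beta * x_j), x_j) for t_j, _, x_j in DATA if t_j >= t_i]
        s0 = sum(w for w, _ in weighted)
        s1 = sum(w * x for w, x in weighted)
        s2 = sum(w * x * x for w, x in weighted)
        mean = s1 / s0
        score += x_i - mean
        information += s2 / s0 - mean * mean
    return score, information


def fit(beta=0.0, tol=1e-10, max_iter=25):
    """Newton-Raphson iteration started at beta = 0."""
    for _ in range(max_iter):
        score, information = score_and_information(beta)
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return beta


beta_hat = fit()
print(beta_hat, log_partial_likelihood(beta_hat))
```

Note that the follow up days are used only to decide who is in each risk set; rescaling the time axis would leave the estimate unchanged, which is the point made in the list above.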

Cox demonstrated that maximum partial likelihood
estimation produces parameter estimates that have the same
properties as maximum likelihood estimates. This is
convenient because, under the same set of regularity
conditions as maximum likelihood estimation, the parameter
estimates from partial likelihood are asymptotically
normal, asymptotically efficient, consistent, and
invariant. So the usual kinds of hypothesis tests
discussed in the context of parametric models extend
directly to the Cox model. The first step in applying
these results to the primary biliary cirrhosis data is to
order the survival times from smallest to largest.
Appendix G shows an example of this data. The partial
likelihood for $\beta$ is now formed by taking the product
over all failure points to give

\[
L(\beta) = \prod_{i=1}^{k}
  \frac{\exp(z_i \beta)}{\sum_{l \in R(t_i)} \exp(z_l \beta)}.
\]

The partial likelihood is not a likelihood in the
usual sense, in that the general construction does not give
a result that is proportional to the conditional or
marginal probability of any observed event. An example of
a partial likelihood, for the data in Appendix G, follows.


\[
PL = \frac{e^{4\beta}}
  {(6e^{\beta}+9)(5e^{\beta}+9)(4e^{\beta}+9)(4e^{\beta}+8)(4e^{\beta}+6)
   (4e^{\beta}+5)(4e^{\beta}+4)(3e^{\beta}+4)(e^{\beta}+4)(2)(1)}
\]

\[
\frac{d(PL)}{d\beta} = \cdots = 0.2875 \tag{3.3}
\]


$PL = 0.2875$ if $\beta = 0$. The first step in applying
the results to the data is to order the survival times
from smallest to largest, with the additional convention
that failure times precede censored times. An efficient
computer solution to the problem would require essentially
the same organization of the data set. In general, there
is an advantage in beginning the calculation at the last
failure time, since the risk set can then be formed by
adding the labels of items failing or censored.
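The backward organization described above can be sketched as follows. The fragment walks the ordered Appendix F data from the last time to the first and keeps a running sum of $\exp(\beta x)$, so that each failure's denominator is available without rescanning the data. The function name and record layout are assumptions for this illustration, and the failure times are assumed to be distinct, as they are in Appendix F.

```python
import math

# (follow up days, event, sex) for the fifteen Appendix F cases;
# event: 1 = failure observed, 0 = right-censored; sex: 0 = male, 1 = female.
RECORDS = [
    (51, 1, 1), (264, 1, 1), (611, 1, 0), (762, 1, 0), (877, 0, 0),
    (1012, 1, 0), (1217, 1, 0), (1427, 1, 1), (2466, 1, 1), (2504, 0, 1),
    (2689, 1, 0), (3445, 0, 0), (3672, 0, 1), (4079, 1, 0), (4191, 1, 0),
]


def risk_set_sums(records, beta):
    """Partial likelihood denominators, accumulated from the last time backward."""
    # Sort by time; at equal times failures (event = 1) precede censored cases.
    ordered = sorted(records, key=lambda rec: (rec[0], -rec[1]))
    running = 0.0
    sums = {}
    for days, event, sex in reversed(ordered):
        running += math.exp(beta * sex)  # this case joins the risk set
        if event:
            sums[days] = running         # denominator for this failure time
    return sums


sums_at_zero = risk_set_sums(RECORDS, beta=0.0)
print(sums_at_zero)
```

At $\beta = 0$ every term equals 1, so the sums are simply the risk set sizes, which gives a quick check against the denominators listed in Appendix G.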


Information Matrix:
Fisher information is a key concept in the theory of
statistical inference and is defined in the following
manner: Let $X = (X_1, \ldots, X_n)$ be a random sample,
and let $f(X \mid \theta)$ denote the probability density
function for some model of the data, which has parameter
vector $\theta = (\theta_1, \ldots, \theta_k)$. Then the
Fisher information matrix $I_n(\theta)$ of sample size $n$
is given by the $k \times k$ symmetric matrix whose
$ij$-th element is given by the covariance between first
partial derivatives of the log-likelihood,

\[
I_n(\theta)_{i,j} = \mathrm{Cov}\left[
  \frac{\partial \ln f(X \mid \theta)}{\partial \theta_i},
  \frac{\partial \ln f(X \mid \theta)}{\partial \theta_j}
  \right].
\]

An alternative, but equivalent, definition for the Fisher
information matrix is based on the expected values of the
second partial derivatives, and is given by

\[
I_n(\theta)_{i,j} = -E\left[
  \frac{\partial^2 \ln f(X \mid \theta)}
       {\partial \theta_i \, \partial \theta_j} \right].
\]

Strictly, this definition corresponds to the expected
Fisher information. If no expectation is taken we obtain a
data-dependent quantity that is called the observed Fisher
information. As a simple example, consider a normal
distribution with mean $\mu$ and variance $\sigma^2$, where
$\theta = (\mu, \sigma^2)$. The Fisher information matrix
for a single observation in this situation is given by

\[
I_1(\theta) = \begin{pmatrix}
  \dfrac{1}{\sigma^2} & 0 \\
  0 & \dfrac{1}{2\sigma^4}
\end{pmatrix}.
\]
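For the normal example, the entries of the matrix can be checked directly from the second-derivative definition. The short derivation below is a sketch for a single observation $x$; the shorthand $v = \sigma^2$ is introduced here only for convenience.

\[
\ln f(x \mid \theta) = -\tfrac{1}{2} \ln(2\pi v) - \frac{(x-\mu)^2}{2v}
\]
\[
\frac{\partial^2 \ln f}{\partial \mu^2} = -\frac{1}{v}, \qquad
\frac{\partial^2 \ln f}{\partial \mu \, \partial v} = -\frac{x-\mu}{v^2}, \qquad
\frac{\partial^2 \ln f}{\partial v^2} = \frac{1}{2v^2} - \frac{(x-\mu)^2}{v^3}
\]

Taking expectations with $E[x-\mu] = 0$ and $E[(x-\mu)^2] = v$, and changing sign, gives the diagonal entries $1/v = 1/\sigma^2$ and $-\tfrac{1}{2v^2} + \tfrac{1}{v^2} = 1/(2\sigma^4)$, with off-diagonal entries equal to 0.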


It is worth noting two useful properties of the
Fisher information matrix. Firstly, $I_n(\theta) = n
I_1(\theta)$, meaning that the expected Fisher information
for a sample of $n$ independent observations is equivalent
to $n$ times the Fisher information for a single
observation. Secondly, it is dependent on the choice of
parameterization. Suppose the parameter $\theta$ is
changed into another parameter $\eta = (\eta_1, \ldots,
\eta_k)$ with $\eta_i = g_i(\theta)$, where $g_i$ is
one-to-one so its inverse $g_i^{-1}(\eta_i) = \theta_i$
exists. The Fisher information $I'(\eta)$ for the new
parameterization is obtained using the chain rule,
$I'(\eta) = J(\eta)^{T} I_n(\theta(\eta)) J(\eta)$, where
$J(\eta)$ is the Jacobian matrix with elements
$J(\eta)_{ij} = \partial g_i^{-1}(\eta_i) / \partial \eta_j$
$(i, j = 1, \ldots, k)$.

Let $T(X)$ be any statistic and let $\psi(\theta)$ be
its expectation, so that $\psi(\theta) = E[T(X)]$. Under
some regularity conditions, it follows that for all
$\theta$,

\[
\mathrm{Var}(T(X)) \ge
  \frac{\left[ d\psi(\theta)/d\theta \right]^2}{I_n(\theta)}.
  \tag{3.4}
\]

The value of the right hand side of (3.4) is known as the
information inequality lower bound. In particular, if
$T(X)$ is an unbiased estimator for $\theta$, then the
numerator becomes 1, and the lower bound is simply
$1/I_n(\theta)$. Note that this explains why $I_n(\theta)$
is called the “information” matrix: the larger the value
of $I_n(\theta)$ is, the smaller the variance becomes, and
therefore we would be more certain about the location of
the unknown parameter value. The information inequality
generalizes to the multi-parameter case, where
$\theta = (\theta_1, \ldots, \theta_k)$. Let the statistic
$W(X)$ be an estimator for some function $g(\theta)$. Then
the inequality states that $\mathrm{Var}(W(X)) \ge
\gamma(\theta)^{T} I_n(\theta)^{-1} \gamma(\theta)$, where
$\gamma(\theta)$ is a $k \times 1$ column vector with
elements $\gamma(\theta)_i = \partial g(\theta) / \partial
\theta_i$.

In asymptotic theory, the maximum likelihood estimator
has many useful properties, including
re-parametrization-invariance, consistency, and
sufficiency. Further, it follows under some regularity
conditions that the sampling distribution of a maximum
likelihood estimator $\hat{\theta}_n$ is asymptotically
unbiased and also asymptotically normal, with its
variance-covariance matrix obtained from the inverse Fisher
information matrix of sample size 1; that is,
$\hat{\theta}_n \to N(\theta, I_1(\theta)^{-1}/n)$ as $n$
goes to infinity. The Fisher information matrix also
arises in Bayesian inference.

The log partial likelihood ratio test is not only the
easiest test to compute, but is also the best of the three
tests for assessing the significance of the fitted model.
The computation of information matrix tests for the
multiple proportional hazards regression model requires
matrix calculations. Specifically, we denote the vector of
first partial derivatives of the log partial likelihood by
$u(\beta)$. Under the hypothesis that all coefficients are
equal to zero, and under the mathematical conditions
needed for the partial likelihood ratio test, the vector
of scores $u(0) = u(\beta)|_{\beta=0}$ will be distributed
as multivariate normal with mean vector equal to zero and
covariance matrix given by the information matrix
evaluated at the coefficient vector equal to zero,
$I(0) = I(\beta)|_{\beta=0}$. The elements in this matrix
are obtained by evaluating the expressions with the
coefficient vector equal to zero. The score test statistic
is

\[ u'(0) \, [I(0)]^{-1} \, u(0), \]

which is distributed asymptotically as chi-square with $p$
degrees-of-freedom, where $p$ is the number of
coefficients in the model. This statistic can be used to
test the null hypothesis $\beta = 0$ by using a chi-square
test.
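For a single covariate the score test reduces to scalar arithmetic, which the following sketch carries out on the fifteen Appendix F cases with sex as the covariate. It is an illustration only: the function name and the event coding (1 = observed failure) are assumptions, the failure times are assumed distinct, and the p-value uses the identity that a chi-square upper tail with one degree of freedom equals `erfc(sqrt(x/2))`.

```python
import math

# (follow up days, event, sex) for the fifteen Appendix F cases;
# event: 1 = failure observed, 0 = right-censored; sex: 0 = male, 1 = female.
DATA = [
    (51, 1, 1), (264, 1, 1), (611, 1, 0), (762, 1, 0), (877, 0, 0),
    (1012, 1, 0), (1217, 1, 0), (1427, 1, 1), (2466, 1, 1), (2504, 0, 1),
    (2689, 1, 0), (3445, 0, 0), (3672, 0, 1), (4079, 1, 0), (4191, 1, 0),
]


def score_test(data):
    """Return u(0), I(0), the score statistic, and its p-value (1 d.f.)."""
    u0, i0 = 0.0, 0.0
    for t_i, event_i, x_i in data:
        if not event_i:
            continue
        at_risk = [x_j for t_j, _, x_j in data if t_j >= t_i]
        # With beta = 0 every case in the risk set carries equal weight.
        mean = sum(at_risk) / len(at_risk)
        variance = sum((x - mean) ** 2 for x in at_risk) / len(at_risk)
        u0 += x_i - mean
        i0 += variance
    statistic = u0 * u0 / i0
    p_value = math.erfc(math.sqrt(statistic / 2.0))  # chi-square, 1 d.f.
    return u0, i0, statistic, p_value


u0, i0, statistic, p_value = score_test(DATA)
print(u0, i0, statistic, p_value)
```

No iteration is needed, since everything is evaluated at $\beta = 0$; this is the practical attraction of the score test when a model is expensive to fit.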
CHAPTER FOUR


PRIMARY BILIARY CIRRHOSIS DATA 


Primary biliary cirrhosis is a disease characterized 
by inflammatory destruction of the small bile ducts within 
the liver. Primary biliary cirrhosis eventually leads to 
cirrhosis of the liver. The cause of primary biliary 
cirrhosis is unknown, but because of the presence of
autoantibodies, it is generally thought to be an
auto-immune disease. Other etiologies, such as infectious
agents, have not been completely excluded. Primary biliary
cirrhosis has a worldwide prevalence of approximately
5/100,000 and an annual incidence of approximately
6/1,000,000. The prevalence and incidence appear to be
similar in different regions of the world. About 90% of
patients with primary biliary cirrhosis are women. Most
commonly, the disease is diagnosed in patients between the
ages of 40 and 60 years (See [13]).

This data set is a follow-up to the original primary
biliary cirrhosis data set; an analysis based on it
appears in “Primary biliary cirrhosis: prediction of
short-term survival based on repeated patient visits.” The
data come from the Mayo Clinic trial in primary biliary
cirrhosis of the liver conducted between 1974 and 1984,
for which a description of the clinical background and the
covariates is available. A total of 424 primary biliary
cirrhosis patients, referred to Mayo Clinic during that
ten-year interval, met eligibility criteria for the
randomized placebo controlled trial of the drug
D-penicillamine. The first 312 cases in the data set
participated in the randomized trial and contain largely
complete data. The additional 112 cases did not
participate in the clinical trial, but consented to have
basic measurements recorded and to be followed for
survival. Six of those cases were lost to follow-up
shortly after diagnosis, so the data here are on an
additional 106 cases as well as the 312 randomized
participants, 418 patients in all.

The original data set contains only baseline
measurements of the laboratory parameters. This data set
contains multiple laboratory results, but only on the
first 312 patients. Some baseline data values in this file
differ from the original primary biliary cirrhosis file,
for instance, the data errors in prothrombin time and age
which were discovered after the original analysis, during
research work on dfbeta residuals. Another major
difference is that there was significantly more follow-up
for many of the patients at the time this data was
assembled.

One “feature” of the data deserves special comment. 
The last observation before death or liver transplant 
often has many more missing covariates than other data 
rows. The original clinical protocol for these patients 
specified visits at 6 months, 1 year, and annually 
thereafter. At these protocol visits lab values were 
obtained for a large pre-specified battery of tests.
“Extra” visits, often undertaken because of worsening
medical condition, did not necessarily have all this lab
work. The missing values are thus potentially informative,
and violate the usual “missing at random” assumptions that
are made in analyses. Because of the earlier published
results on the Mayo primary biliary cirrhosis risk score, 
however, the 5 variables involved in that computation were 
usually obtained, i.e. age, bilirubin, albumin, 
prothrombin time, and edema score. The variables used
were:

• Case number;
• Number of days between registration and the earlier of
death, transplantation, or study analysis time;
• Status: 0 = alive, 1 = transplanted, 2 = dead;
• Drug: 1 = D-penicillamine, 0 = placebo;
• Age in days, at registration;
• Sex: 0 = male, 1 = female;
• Day: number of days between enrollment and this visit
date (remaining values on the line of data refer to this
visit);
• Serum bilirubin in mg/dl;
• Serum cholesterol in mg/dl;
• Albumin in gm/dl;
• Alkaline phosphatase in U/liter;
• SGOT in U/ml (serum glutamic-oxaloacetic transaminase;
the enzyme name has subsequently changed to “AST” in the
medical literature);
• Platelets per cubic ml / 1000;
• Prothrombin time in seconds;
• Histologic stage of disease.

We used EPI-Info to calculate Kaplan-Meier for the 312
Primary Biliary Cirrhosis patients by Gender for Age,
Albumin, Alkaline, Bili, Platelets, and Spiders (See
Appendix J). The smaller the p-value, the stronger the
evidence that a covariate affects the outcome; a covariate
with a large p-value is not significant. In the outcome
for the 312 patients by Gender there was a noticeable
change. In the outcome for the 312 patients by Drug there
was not a noticeable change (See Appendix L). The smallest
p-values were shown for Age, Albumin, Alkaline, and Bili.
The coefficient ($\beta$) for Gender was $\beta = -0.0804$,
based on $h(t) = h_0(t) e^{\beta x}$, indicating that the
hazard function for females (sex = 1) is smaller than for
males (sex = 0). However, the 95 percent confidence
interval for the hazard ratio (0.4903 to 1.7367, Appendix
J) includes 1, so the male hazard could be lower (See
Appendix H). The nonparametric survival plot for follow up
days by Placebo and Penicillamine for the 312 patients is
illustrated in Appendix K on the Kaplan-Meier curve for
all patients, n = 312.
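The link between a coefficient and the quantities reported in Appendix J can be sketched as follows: the hazard ratio is $e^{\beta}$, the 95 percent limits are $e^{\beta \pm 1.96\,\mathrm{S.E.}}$, and the Z-statistic is $\beta / \mathrm{S.E.}$ The function name is hypothetical, and the inputs are the rounded Gender coefficient and standard error printed in Appendix J, so small differences in the last digit are to be expected.

```python
import math


def hazard_ratio_summary(coefficient, standard_error, z_critical=1.96):
    """Hazard ratio, Wald confidence limits, and Z-statistic for one coefficient."""
    half_width = z_critical * standard_error
    return {
        "hazard_ratio": math.exp(coefficient),
        "ci_lower": math.exp(coefficient - half_width),
        "ci_upper": math.exp(coefficient + half_width),
        "z": coefficient / standard_error,
    }


# Gender row of Appendix J: coefficient -0.0804, S.E. 0.3227.
summary = hazard_ratio_summary(-0.0804, 0.3227)
for name, value in summary.items():
    print(name, round(value, 4))
```

Because the interval is built on the coefficient scale and then exponentiated, it is not symmetric about the hazard ratio, and it contains 1 exactly when the interval for $\beta$ contains 0.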

We used the nonparametric survival plot for follow up days
by gender for the 312 patients. The plot shows the
survival curves for all categories for Primary Biliary
Cirrhosis (See Appendix H). The results of the Minitab
calculations are in Appendix I.

We used EPI-Info to calculate Primary Biliary
Cirrhosis by drug for Age, Albumin, Alkaline, Bili,
Platelets, and Spiders for the 312 patients (See Appendix
L). In the outcome for the 312 patients by Drug there was
no noticeable change. The results in Appendix L show that
the hazard for the two drug groups is not noticeably
different: the hazard ratio for the drug was 0.9775. This
difference could be due to the non-linear effect of the
drug itself.
APPENDIX A


DATA OF PRIMARY BILIARY CIRRHOSIS I
FU Days
321
552
691
877
939
1487
1746
APPENDIX B


CALCULATION OF THE
KAPLAN-MEIER ESTIMATE

Time (t)   Rank r   (n-r)/(n-r+1)   S(t)
321        1        15/16           0.938
552        2        14/15           (0.938)(0.933) = 0.875
691        3
769        4        12/13           (0.875)(0.923) = 0.808
877        5
890        6        10/11           (0.808)(0.909) = 0.734
939        7
1487       8        8/9             (0.734)(0.889) = 0.653
1746       9        7/8             (0.653)(0.875) = 0.571
2033       10
2386       11       5/6             (0.571)(0.833) = 0.476
2400       12       4/5             (0.476)(0.800) = 0.381
2576       13
2689       14       2/3             (0.381)(0.667) = 0.254
2812       15       1/2             (0.254)(0.500) = 0.127
3069       16       0/1             0
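The product-limit column of this appendix can be reproduced with a few lines of code. The sketch below assumes that the failures are the ranks for which a factor $(n-r)/(n-r+1)$ is listed, namely ranks 1, 2, 4, 6, 8, 9, 11, 12, 14, 15, and 16, with the remaining ranks censored; that reading of the table, like the function name, is an assumption made for illustration.

```python
# Ordered follow up days for the sixteen patients of this appendix.
TIMES = [321, 552, 691, 769, 877, 890, 939, 1487, 1746, 2033,
         2386, 2400, 2576, 2689, 2812, 3069]

# Ranks treated as failures (assumed from the factors listed in the table).
FAILED_RANKS = {1, 2, 4, 6, 8, 9, 11, 12, 14, 15, 16}


def kaplan_meier(times, failed_ranks):
    """Return (time, S(t)) at each failure, using the factor (n-r)/(n-r+1)."""
    n = len(times)
    survival = 1.0
    curve = []
    for rank, days in enumerate(sorted(times), start=1):
        if rank in failed_ranks:
            survival *= (n - rank) / (n - rank + 1)
            curve.append((days, survival))
    return curve


curve = kaplan_meier(TIMES, FAILED_RANKS)
for days, survival in curve:
    print(days, round(survival, 3))
```

Censored ranks leave the estimate unchanged but still reduce $n - r + 1$, the number at risk, for every later failure.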
CALCULATION OF THE MINITAB

Kaplan-Meier Estimates

       Number   Number   Survival      Standard   95.0% Normal CI
Time   at Risk  Failed   Probability   Error      Lower      Upper
322    8        1        0.875000      0.116927   0.645828   1.00000
769    6        1        0.729167      0.164976   0.405819   1.00000
1487                     0.546875      0.200580   0.153745   0.94000
2400                     0.364583      0.200086   0.000000   0.75675
2812                     0.000000      0.000000   0.000000   0.00000

APPENDIX D 
GRAPH OF NONPARAMETRIC SURVIVAL 


PLOT FOR FOLLOW UP DAYS 
FOR 15 PATIENTS 
[Figure: Nonparametric Survival Plot for Follow Up Days, Kaplan-Meier Method; Censoring Column in Censor; curves by Sex (Male, Female); horizontal axis: Follow Up Days]

Table of Statistics
Sex      Mean      Median   IQR
Male     2398.13   2689     3067
Female   1523.33   1427     *
APPENDIX E
CALCULATION OF THE 


LOG-RANK TEST 
FOR 16 PATIENTS 
APPENDIX F


PRIMARY BILIARY CIRRHOSIS
FOR 15 PATIENTS

Patient  ID   FUDays  Status  Censor  Drug  Age    Sex  Ascites  Platelets
1        10   51      2       0       2     25772  1    1        302
2        23   264     2       0       2     20442  1    1        214
3        97   611     2       0       2     26259  0    0        344
4        149  762     2       0       1     22574  0    0        140
5        295  877     1       1       1     12912  0    0        306
6        3    1012    2       0       1     25594  0    0        154
7        14   1217    2       0       2     20535  0    1        156
8        148  1427    2       0       2     11273  1    0        330
9        8    2466    2       0       2     19379  1    0        373
10       190  2504    0       1       1     19916  1    0        327
11       90   2689    2       0       1     12227  0    0        337
12       21   3445    0       1       2     23445  0    0        336
13       16   3672    0       1       2     14772  1    0        198
14       24   4079    2       0       1     16261  0    0        70
15       66   4191    2       0       1     16967  0    0        123
APPENDIX G
PROPORTIONAL HAZARDS MODEL


APPLIED TO PRIMARY
BILIARY CIRRHOSIS

Patient  FUDays  Sex  Censored          Contribution to Likelihood
1        51      1                      $e^{\beta}/(6e^{\beta}+9)$
2        264     1                      $e^{\beta}/(5e^{\beta}+9)$
3        611     0                      $1/(4e^{\beta}+9)$
4        762     0    0(877)            $1/(4e^{\beta}+8)$
6        1012    0                      $1/(4e^{\beta}+6)$
7        1217    0                      $1/(4e^{\beta}+5)$
8        1427    1                      $e^{\beta}/(4e^{\beta}+4)$
9        2466    1    1(2504)           $e^{\beta}/(3e^{\beta}+4)$
11       2689    0    0(3445), 1(3672)  $1/(1e^{\beta}+4)$
14       4079    0                      $1/(0e^{\beta}+2)$
15       4191    0                      $1/(0e^{\beta}+1)$

APPENDIX H 
NONPARAMETRIC SURVIVAL PLOT 


FOR FOLLOW UP DAYS BY GENDER 
FOR 312 PATIENTS 
[Figure: Nonparametric Survival Plot for Follow Up Days, Kaplan-Meier Method; Censoring Column in Censor; curves by Sex (Male, Female); horizontal axis: Follow Up Days]

Table of Statistics
Sex      Mean      Median   IQR
Male     2404.23   2386     3179
Female   2773.30   3428     *
APPENDIX I
MINITAB CALCULATIONS:


PRIMARY BILIARY CIRRHOSIS BY GENDER
FOR 312 PATIENTS

Kaplan-Meier Estimates

Columns: Time, Number at Risk, Number Failed, Survival Probability, Standard Error, 95.0% Normal CI Lower, 95.0% Normal CI Upper
41 275 1 0.996364 0.0036297 0.989249 1.00000 
51 274 1 0.992727 0.0051239 0.982685 1.00000 
71 273 1 0.989091 0.0062639 0.976814 1.00000 
77 272 1 0.985455 0.0072196 0.971304 0.99960 
110 271 1 0.981818 0.0080569 0.966027 0.99761 
130 270 1 0.978182 0.0088095 0.960915 0.99545 
131 269 1 0.974545 0.0094977 0.955930 0.99316 
179 268 1 0.970909 0.0101345 0.951046 0.99077 
186 267 1 0.967273 0.0107291 0.946244 0.98830 
198 266 1 0.963636 0.0112882 0.941512 0.98576 
207 265 1 0.960000 0.0118168 0.936840 0.98316 
216 264 1 0.956364 0.0123188 0.932219 0.98051 
223 263 1 0.952727 0.0127974 0.927645 0.97781 
264 262 2 0.945455 0.0136941 0.918615 0.97229 
304 260 1 0.941818 0.0141160 0.914151 0.96948 
321 259 1 0.938182 0.0145223 0.909719 0.96664 
326 258 1 0.934545 0.0149143 0.905314 0.96378 
334 257 1 0.930909 0.0152932 0.900935 0.96088 
348 256 1 0.927273 0.0156598 0.896580 0.95797 
388 255 1 0.923636 0.0160150 0.892248 0.95503 
400 254 1 0.920000 0.0163596 0.887936 0.95206 
460 253 1 0.916364 0.0166942 0.883644 0.94908 
515 252 1 0.912727 0.0170194 0.879370 0.94608 
549 251 1 0.909091 0.0173357 0.875114 0.94307 
597 250 1 0.905455 0.0176436 0.870874 0.94004 
673 249 1 0.901818 0.0179436 0.866649 0.93699 
694 248 1 0.898182 0.0182360 0.862440 0.93392 
708 247 1 0.894545 0.0185211 0.858245 0.93085 
733 245 1 0.890894 0.0188020 0.854043 0.92775 
750 243 1 0.887228 0.0190787 0.849834 0.92462 
769 242 1 0.883562 0.0193489 0.845639 0.92148 
786 241 1 0.879896 0.0196129 0.841455 0.91834 
790 240 1 0.876229 0.0198709 0.837283 0.91518 
797 239 1 0.872563 0.0201231 0.833123 0.91200 
824 238 1 0.868897 0.0203698 0.828973 0.90882 
850 235 1 0.865199 0.0206160 0.824793 0.90561 
853 234 1 0.861502 0.0208568 0.820623 0.90238 
859 233 1 0.857805 0.0210925 0.816464 0.89915 
904 231 1 0.854091 0.0213255 0.812294 0.89589 
930 230 1 0.850378 0.0215537 0.808133 0.89262 
943 228 1 0.846648 0.0217795 0.803961 0.88933 
971 227 1 0.842918 0.0220006 0.799798 0.88604 
974 226 1 0.839189 0.0222171 0.795644 0.88273 
980 225 1 0.835459 0.0224293 0.791498 0.87942 
1000 223 1 0.831712 0.0226394 0.787340 0.87608 
1037 221 1 0.827949 0.0228476 0.783168 0.87273 
1080 219 1 0.824168 0.0230540 0.778983 0.86935
1083 218 1 0.820388 0.0232561 0.774807 0.86597
1165 214 1 0.816554 0.0234613 0.770571 0.86254
1170 213 1 0.812721 0.0236623 0.766343 0.85910
1191 212 2 0.805053 0.0240521 0.757912 0.85219
1212 210 1 0.801220 0.0242412 0.753708 0.84873
1235 205 1 0.797311 0.0244360 0.749418 0.84521
1350 195 1 0.793223 0.0246504 0.744909 0.84154
1356 194 1 0.789134 0.0248601 0.740409 0.83786
1413 188 1 0.784936 0.0250797 0.735781 0.83409
1427 185 1 0.780693 0.0253005 0.731105 0.83028
1434 183 1 0.776427 0.0255194 0.726410 0.82644
1444 180 1 0.772114 0.0257396 0.721665 0.82256
1487 175 1 0.767702 0.0259679 0.716806 0.81860
1492 174 1 0.763290 0.0261908 0.711957 0.81462
1576 167 1 0.758719 0.0264298 0.706918 0.81052
1657 162 1 0.754036 0.0266783 0.701747 0.80632
1690 159 2 0.744551 0.0271727 0.691293 0.79781
1741 154 1 0.739716 0.0274230 0.685968 0.79346
1786 148 1 0.734718 0.0276894 0.680448 0.78899
1827 145 1 0.729651 0.0279582 0.674854 0.78445
1847 142 1 0.724513 0.0282296 0.669184 0.77984
1925 137 1 0.719224 0.0285146 0.663337 0.77511
2055 128 1 0.713605 0.0288401 0.657080 0.77013
2081 127 1 0.707986 0.0291553 0.650843 0.76513
2090 126 1 0.702367 0.0294604 0.644626 0.76011
2105 125 1 0.696749 0.0297557 0.638428 0.75507
2224 114 1 0.690637 0.0301158 0.631611 0.74966
2256 111 1 0.684415 0.0304805 0.624674 0.74416
2288 109 1 0.678136 0.0308408 0.617689 0.73858
2297 107 1 0.671798 0.0311970 0.610653 0.73294
2400 98 1 0.664943 0.0316228 0.602963 0.72692
2419 97 1 0.658088 0.0320312 0.595308 0.72087
2466 92 1 0.650935 0.0324719 0.587291 0.71458
2503 89 1 0.643621 0.0329204 0.579098 0.70814
2540 85 1 0.636049 0.0333926 0.570601 0.70150
2583 77 1 0.627788 0.0339653 0.561218 0.69436
2598 76 1 0.619528 0.0345082 0.551893 0.68716
2769 66 1 0.610141 0.0352389 0.541074 0.67921
2847 62 1 0.600300 0.0360185 0.529705 0.67090
3086 52 1 0.588756 0.0371298 0.515983 0.66153
3090 51 1 0.577212 0.0381542 0.502431 0.65199
3170 45 1 0.564385 0.0394035 0.487155 0.64161
3222 44 1 0.551558 0.0405420 0.472097 0.63102
3244 42 1 0.538426 0.0416493 0.456794 0.62006
3282 40 1 0.524965 0.0427279 0.441220 0.60871
3358 37 1 0.510777 0.0438656 0.424802 0.59675
3428 34 1 0.495754 0.0450746 0.407409 0.58410
3445 33 1 0.480731 0.0461443 0.390290 0.57117
3574 31 1 0.465224 0.0471896 0.372734 0.55771
3584 28 1 0.448608 0.0483409 0.353862 0.54335
3762 24 1 0.429916 0.0498096 0.332291 0.52754
3839 22 1 0.410375 0.0512357 0.309955 0.51079
3853 20 1 0.389856 0.0526224 0.286718 0.49299


APPENDIX J 


EPI-INFO CALCULATIONS: 
PRIMARY BILIARY CIRRHOSIS BY GENDER 
FOR 312 PATIENTS 
Term         Hazard Ratio  95% C.I.         Coefficient  S.E.    Z-Statistic  P-Value
Sex(Yes/No)  0.9228        0.4903  1.7367   -0.0804      0.3227  -0.2492      0.8032
Age          1.0001        1.0001  1.0002   0.0001       0.0     4.2321       0.0
Albumin      0.3024        0.1798  0.5086   -1.196       0.2653  -4.5083      0.0
Alkaline     1.0051        1.002   1.0081   0.0051       0.0016  3.2592       0.0011
Bili         1.4099        1.3181  1.508    0.3435       0.0343  10.0035      0.0
Platelets    0.9982        0.9963  1.0001   -0.0018      0.001   -1.8612      0.0627
Spiders      1.4712        0.9356  2.3135   0.3861       0.231   1.6717       0.0946

Convergence: Diverged
Iterations: 2
-2 * Log-Likelihood: 1357.5497

Test              Statistic  D.F.  P-Value
Score             261.9749   7     0.0
Likelihood Ratio  -104.5501  7     1.0
APPENDIX K
NONPARAMETRIC SURVIVAL PLOT


FOR FOLLOW UP DAYS BY DRUG
FOR 312 PATIENTS

[Figure: Nonparametric Survival Plot for Follow Up Days, Kaplan-Meier Method; Censoring Column in Censor; curves by Drug (Placebo, Penicillamine); horizontal axis: Follow Up Days]

Table of Statistics
Drug           Mean      Median   IQR
Placebo        2746.18   3428     *
Penicillamine  2833.04   3282     *
APPENDIX L
EPI-INFO CALCULATIONS:


PRIMARY BILIARY CIRRHOSIS BY DRUG
FOR 312 PATIENTS

Term          Hazard Ratio  95% C.I.         Coefficient  S.E.    Z-Statistic  P-Value
Drug(Yes/No)  0.9775        0.683   1.3991   -0.0227      0.1829  -0.1242      0.9011
Age           1.0001        1.0001  1.0002   0.0001       0.0     4.2723       0.0
Albumin       0.3059        0.1831  0.5111   -1.1844      0.2619  -4.5231      0.0
Alkaline      1.0052        1.0024  1.0081   0.0052       0.0014  3.6178       0.0003
Bili          1.4092        1.3176  1.5073   0.343        0.0343  9.9969       0.0
Platelets     0.9981        0.9963  1.0      -0.0019      0.001   -1.9534      0.0508
Spiders       1.4579        0.9318  2.2813   0.377        0.2284  1.6505       0.0988

Convergence: Diverged
Iterations: 2
-2 * Log-Likelihood: 1358.8139

Test              Statistic  D.F.  P-Value
Score             261.9283   7     0.0
Likelihood Ratio  -105.8143  7     1.0